7 research outputs found

    Towards Automatic and Adaptive Optimizations of MPI Collective Operations

    Message passing is one of the most commonly used paradigms of parallel programming. The Message Passing Interface, MPI, is a standard used in scientific and high-performance computing. Collective operations are the subset of the MPI standard that deals with process synchronization, data exchange, and computation among a group of processes. Collective operations are commonly used and can be an application performance bottleneck. The performance of collective operations depends on many factors, some of which are the input parameters (e.g., communicator and message size); system characteristics (e.g., interconnect type); the application computation and communication pattern; and internal algorithm parameters (e.g., internal segment size). We refer to an algorithm and its internal parameters as a method. The goal of this dissertation is a performance improvement of MPI collective operations and of applications that use them. In our framework, during a collective call, a system-specific decision function is invoked to select the most appropriate method for the particular collective instance. This dissertation focuses on automatic techniques for system-specific decision function generation. Our approach takes the following steps: first, we collect method performance information on the system of interest; second, we analyze this information using parallel communication models, graphical encoding methods, and decision trees; third, based on the previous step, we automatically generate the system-specific decision function to be used at run time. In situations where detailed performance measurement is not feasible, method performance models can be used to supplement the measured method performance information. We build and evaluate parallel communication models of 35 different collective algorithms. These models are built on top of three commonly used point-to-point communication models: Hockney, LogGP, and PLogP. We use the method performance information on a system to build quadtrees and C4.5 decision trees of variable sizes and accuracies. The collective method selection functions are then generated automatically from these trees. Our experiments show that quadtrees of three or four levels are often enough to approximate the experimentally optimal decision with a small mean performance penalty (less than 10%). The C4.5 decision trees are even more accurate (with a mean performance penalty of less than 5%). The size and accuracy of C4.5 decision trees can be further improved with the use of appropriate composite attributes (such as “total message size” or “even communicator size”). Finally, we apply these techniques to tune the collective operations on the Grig cluster at the University of Tennessee and to improve application performance on the Cray XT4 system at Oak Ridge National Laboratory. The tuned collective achieves more than a 40% mean performance improvement over the native broadcast implementation. Using the platform-specific reduce on the Cray XT4 led to a 10% improvement in overall application performance. Our results show that the methods we explored are both applicable and effective for system-specific optimization of collective operations and are a step toward automatically tunable, adaptive MPI collectives.
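
    The run-time selection step described above can be pictured as a small piece of branching code emitted from the learned decision tree. Below is a minimal, hypothetical C sketch of such a system-specific decision function for broadcast; the algorithm names, thresholds, segment sizes, and the bcast_method_t type are illustrative assumptions, not the dissertation's actual generated output.

```c
#include <stddef.h>

/* Hypothetical sketch of an auto-generated, system-specific decision
 * function for MPI_Bcast.  A real generated function would encode the
 * quadtree or C4.5 decision tree learned from measurements on the
 * target system; the thresholds below are placeholders. */
typedef struct {
    enum { BCAST_BINOMIAL, BCAST_SPLIT_BINTREE, BCAST_PIPELINE } algorithm;
    size_t segment_size;            /* internal parameter; 0 = no segmentation */
} bcast_method_t;

static bcast_method_t select_bcast_method(int comm_size, size_t msg_size)
{
    bcast_method_t m;
    if (msg_size < 2048) {          /* small messages: latency-bound   */
        m.algorithm = BCAST_BINOMIAL;
        m.segment_size = 0;
    } else if (comm_size <= 16) {   /* small communicators             */
        m.algorithm = BCAST_SPLIT_BINTREE;
        m.segment_size = 8192;
    } else {                        /* large messages, large groups    */
        m.algorithm = BCAST_PIPELINE;
        m.segment_size = 32768;
    }
    return m;
}
```

    At run time, the collective wrapper would call such a function once per invocation and dispatch to the chosen algorithm with the chosen segment size.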

    Scalable fault tolerant MPI: extending the recovery algorithm (Euro PVM/MPI)

    Abstract. Fault Tolerant MPI (FT-MPI) [6] was designed as a solution that allows applications different methods of handling process failures beyond simple checkpoint-restart schemes. The initial implementation of FT-MPI included a robust, heavyweight system-state recovery algorithm designed to manage the membership of MPI communicators during multiple failures. The algorithm and its implementation, although robust, were very conservative, and this affected their scalability on both very large clusters and distributed systems. This paper details the FT-MPI recovery algorithm and our initial experiments with new recovery algorithms that aim to be both scalable and latency tolerant. Our conclusions show that the combined use of topology-aware collective communication and distributed consensus algorithms produces the best results.
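
    As a conceptual illustration only (not the FT-MPI implementation), the membership-agreement step of such a recovery can be reduced to a bitwise AND over the survivors' views of which ranks are alive: a rank remains in the recovered communicator only if every survivor still sees it. The sketch below performs this combination in a single address space; in the real system it would be carried out by a topology-aware, fault-tolerant reduction across the survivors.

```c
#include <stdint.h>
#include <string.h>

/* views  : nsurvivors x nprocs matrix; views[s * nprocs + r] == 1 if
 *          survivor s believes rank r is still alive.
 * agreed : output; agreed[r] == 1 only if every survivor agrees.      */
void agree_on_membership(int nsurvivors, int nprocs,
                         const uint8_t *views, uint8_t *agreed)
{
    memset(agreed, 1, (size_t)nprocs);
    for (int s = 0; s < nsurvivors; s++)
        for (int r = 0; r < nprocs; r++)
            agreed[r] &= views[s * nprocs + r];
}
```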

    Flexible collective communication tuning architecture applied to Open MPI

    Abstract. Collective communications are invaluable to modern high-performance applications, although most users of these communication patterns do not always want to know their innermost workings. The implementation of the collectives is often left to the middleware developer, such as those providing an MPI library. As many of these libraries are designed to be both generic and portable, MPI developers commonly offer internal tuning options, suitable only for knowledgeable users, that allow some level of customization. The work presented in this paper aims not only to provide a very efficient set of collective operations for use with the Open MPI implementation but also to make their control and tuning straightforward and flexible. Additionally, this paper demonstrates a novel example of the proposed framework's flexibility by dynamically tuning an MPI_Alltoallv algorithm at runtime.
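
    A minimal sketch of what such a tunable-collective layer might look like follows; the type names, the dispatch table, and the size-based rule are assumptions for illustration, not Open MPI's actual interfaces. The point is that the decision rule is ordinary data the framework can replace at run time, for example while it re-times MPI_Alltoallv during the run.

```c
#include <stddef.h>

/* One entry per candidate implementation of the collective.           */
typedef int (*alltoallv_impl_t)(const void *sendbuf, const int *sendcounts,
                                void *recvbuf, const int *recvcounts,
                                int comm_size);
typedef struct {
    const char      *name;   /* e.g. "linear", "pairwise"               */
    alltoallv_impl_t fn;
} alltoallv_entry_t;

/* Example decision rule: index into the table by total payload size.
 * A dynamic tuner could adjust 'threshold' (or swap the rule entirely)
 * as it observes timings during the run.                               */
static size_t pick_alltoallv(const int *sendcounts, int comm_size,
                             size_t threshold)
{
    size_t total = 0;
    for (int i = 0; i < comm_size; i++)
        total += (size_t)sendcounts[i];
    return (total < threshold) ? 0 : 1;
}
```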

    SEMANTICS, DESIGN AND APPLICATIONS FOR HIGH PERFORMANCE COMPUTING

    With increasing numbers of processors on current machines, the probability of node or link failures is also increasing. Therefore, application-level fault tolerance is becoming an increasingly important issue for both end users and the institutions running the machines. In this paper we present the semantics of a fault-tolerant version of the Message Passing Interface (MPI), the de facto standard for communication in scientific applications, which gives applications the possibility of recovering from a node or link error and continuing execution in a well-defined way. We present the architecture of fault-tolerant MPI, an implementation of MPI using the semantics presented above, as well as benchmark results with various applications. An example of a fault-tolerant parallel equation solver, performance results, and the time needed to recover from a process failure are furthermore detailed.
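
    The application-side pattern these semantics enable can be sketched roughly as follows, using only standard MPI calls; the recovery step itself is left as a comment because its exact calls and modes are specific to the fault-tolerant MPI implementation described in the paper.

```c
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    /* Errors are reported to the application instead of aborting it.  */
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);

    double local = 1.0, global = 0.0;
    int converged = 0;
    while (!converged) {
        int rc = MPI_Allreduce(&local, &global, 1, MPI_DOUBLE,
                               MPI_SUM, MPI_COMM_WORLD);
        if (rc != MPI_SUCCESS) {
            /* A process failed.  The fault-tolerant MPI would now let
             * the application rebuild or shrink the communicator
             * according to its chosen recovery mode, restore the lost
             * rank's data (e.g. from a checkpoint or a neighbour's
             * copy), and redo the interrupted iteration.               */
            continue;
        }
        converged = (global > 0.0);     /* placeholder convergence test */
    }

    MPI_Finalize();
    return 0;
}
```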

    Self-adapting numerical software (SANS) effort

    The challenge for the development of next-generation software is the successful management of the complex computational environment while delivering to the scientist the full power of flexible compositions of the available algorithmic alternatives. Self-Adapting Numerical Software (SANS) systems are intended to meet this significant challenge. The process of arriving at an efficient numerical solution of problems in computational science involves numerous decisions by a numerical expert. Attempts to automate such decisions distinguish three levels: algorithmic decisions; management of the parallel environment; and processor-specific tuning of kernels. Additionally, at any of these levels we can decide to rearrange the user's data. In this paper we look at a number of efforts at the University of Tennessee that are investigating these areas.
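
    Purely as an illustration of the first (algorithmic-decision) level, a self-adapting layer might inspect coarse problem features and pick a solution strategy; the features, thresholds, and solver names below are hypothetical and not part of the SANS systems described above.

```c
#include <stddef.h>

typedef enum { SOLVER_DIRECT, SOLVER_CG, SOLVER_GMRES } solver_t;

/* Toy algorithmic decision: small or fairly dense systems go to a
 * direct solver, sparse symmetric positive-definite systems to CG,
 * everything else to GMRES.  A SANS-style system would base this on
 * measured or modelled performance rather than fixed thresholds.      */
static solver_t choose_solver(size_t n, double fill_ratio, int symmetric_pd)
{
    if (n < 5000 || fill_ratio > 0.25)
        return SOLVER_DIRECT;
    if (symmetric_pd)
        return SOLVER_CG;
    return SOLVER_GMRES;
}
```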

    A Multiscale Model for Avascular Tumor Growth

    The desire to understand tumor complexity has given rise to mathematical models to describe the tumor microenvironment. We present a new mathematical model for avascular tumor growth and development that spans three distinct scales. At the cellular level, a lattice Monte Carlo model describes cellular dynamics (proliferation, adhesion, and viability). At the subcellular level, a Boolean network regulates the expression of proteins that control the cell cycle. At the extracellular level, reaction-diffusion equations describe the chemical dynamics (nutrient, waste, growth promoter, and inhibitor concentrations). Data from experiments with multicellular spheroids were used to determine the parameters of the simulations. Starting with a single tumor cell, this model produces an avascular tumor that quantitatively mimics experimental measurements in multicellular spheroids. Based on the simulations, we predict: 1) the microenvironmental conditions required for tumor cell survival; and 2) that growth promoters and inhibitors have diffusion coefficients in the range between 10⁻⁶ and 10⁻⁷ cm²/h, corresponding to molecules of size 80–90 kDa. Using the same parameters, the model also accurately predicts spheroid growth curves under different external nutrient supply conditions.
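
    At the extracellular level, the chemical fields in models of this kind typically obey reaction-diffusion equations of the generic form below (a sketch of the general form, not the paper's exact system), where c is a concentration such as nutrient or growth promoter.

```latex
% Generic reaction-diffusion form for one chemical species c:
% diffusion with coefficient D_c, cell-dependent production P_c,
% and first-order consumption/decay with rate \lambda_c.
\frac{\partial c}{\partial t} = D_c \nabla^{2} c + P_c(\mathbf{x}, t) - \lambda_c\, c
```

    The diffusion coefficients in the 10⁻⁶ to 10⁻⁷ cm²/h range quoted above would enter such equations as the D_c of the growth promoter and inhibitor species.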